484 research outputs found

    Error and Error Mitigation in Low-Coverage Genome Assemblies

    The recent release of twenty-two new genome sequences has dramatically increased the data available for mammalian comparative genomics, but twenty of these new sequences are currently limited to ~2× coverage. Here we examine the extent of sequencing error in these 2× assemblies, and its potential impact in downstream analyses. By comparing 2× assemblies with high-quality sequences from the ENCODE regions, we estimate the rate of sequencing error to be 1–4 errors per kilobase. While this error rate is fairly modest, sequencing error can still have surprising effects. For example, an apparent lineage-specific insertion in a coding region is more likely to reflect sequencing error than a true biological event, and the length distribution of coding indels is strongly distorted by error. We find that most errors are contributed by a small fraction of bases with low quality scores, in particular, by the ends of reads in regions of single-read coverage in the assembly. We explore several approaches for automatic sequencing error mitigation (SEM), making use of the localized nature of sequencing error, the fact that it is well predicted by quality scores, and information about errors that comes from comparisons across species. Our automatic methods for error mitigation cannot replace the need for additional sequencing, but they do allow substantial fractions of errors to be masked or eliminated at the cost of modest amounts of over-correction, and they can reduce the impact of error in downstream phylogenomic analyses. Our error-mitigated alignments are available for download. Funding: National Science Foundation (U.S.) (Faculty Early Career Development grant DBI-0644111); National Science Foundation (U.S.) (Faculty Early Career Development grant DBI-0644282); National Science Foundation (U.S.) (Faculty Early Career Development grant U54 HG004555-01); David & Lucile Packard Foundation; David & Lucile Packard Foundation (Fellowship for Science and Engineering
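The abstract above notes that sequencing error is well predicted by quality scores and that errors can be masked. The following is a minimal sketch of that general idea, not the authors' SEM method: it uses the standard Phred relation P(error) = 10^(-Q/10) and replaces low-quality bases with `N`. The function names and the threshold of 20 are illustrative assumptions.

```python
def phred_error_probability(q: float) -> float:
    """Standard Phred definition: P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def expected_errors_per_kb(quals: list[int]) -> float:
    """Expected number of errors per 1000 bases implied by the quality scores."""
    if not quals:
        raise ValueError("quals must not be empty")
    total = sum(phred_error_probability(q) for q in quals)
    return 1000 * total / len(quals)


def mask_low_quality(seq: str, quals: list[int], min_q: int = 20) -> str:
    """Replace every base whose quality is below min_q with 'N'.

    min_q = 20 (1% error probability) is an assumed, illustrative cutoff.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have equal length")
    return "".join(b if q >= min_q else "N" for b, q in zip(seq, quals))


if __name__ == "__main__":
    # Hypothetical read with two low-quality positions.
    print(mask_low_quality("ACGTAC", [40, 35, 10, 30, 5, 22]))
```

Masking trades errors for missing data, which matches the abstract's point that mitigation comes at the cost of some over-correction.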

    Iterative simulations to estimate the elastic properties from a series of MRI images followed by MRI-US validation

    The modeling of breast deformations is of interest in medical applications such as image-guided biopsy or image registration for diagnostic purposes. Modeling these deformations requires the mechanical properties of the tissues. In this work, we propose an iterative technique based on finite element analysis that estimates the elastic modulus of realistic breast phantoms, starting from MRI images acquired in different positions (prone and supine) in which the phantom is deformed only by gravity. We validated the method using both a single-modality evaluation, in which we simulated the effect of gravity to generate four different configurations (prone, supine, lateral, and vertical), and a multi-modality evaluation, in which we simulated a series of changes in orientation (prone to supine). Validation is performed, respectively, on surface points and lesions, using data from MRI images as ground truth, and on target lesions inside the breast phantom, compared with the actual target segmented from the US image. The use of pre-operative images is limited at the moment to diagnostic purposes. By using our method we can compute patient-specific mechanical properties that allow compensating deformations
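The iterative estimation described above can be pictured as an inverse problem: adjust the elastic modulus until a simulated deformation matches the observed one. The sketch below is a generic illustration under strong assumptions, not the authors' procedure. The `simulate` callable stands in for a finite element solver, it is assumed to return a displacement that strictly decreases as the modulus increases, and the search uses bisection in log space.

```python
import math
from typing import Callable


def estimate_modulus(
    simulate: Callable[[float], float],
    observed: float,
    lo: float = 1e2,
    hi: float = 1e6,
    rel_tol: float = 1e-6,
    max_iter: int = 200,
) -> float:
    """Find E such that simulate(E) matches the observed displacement.

    simulate is a hypothetical forward model (e.g. a wrapper around an FE
    solver) and must be strictly decreasing in E on [lo, hi].
    """
    if not (simulate(lo) >= observed >= simulate(hi)):
        raise ValueError("observed displacement is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = math.sqrt(lo * hi)  # geometric midpoint: E spans orders of magnitude
        if simulate(mid) > observed:
            lo = mid  # material too soft, increase the modulus
        else:
            hi = mid  # material too stiff (or exact), decrease the modulus
        if hi / lo - 1.0 < rel_tol:
            break
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    # Toy forward model (an assumption): a uniform column sagging under its
    # own weight, displacement = rho * g * L**2 / (2 * E).
    def toy(E: float) -> float:
        return 1000.0 * 9.81 * 0.1**2 / (2.0 * E)

    print(estimate_modulus(toy, observed=toy(5000.0)))
```

In practice each call to the forward model is a full simulation, so the number of iterations matters far more than in this toy setting.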

    Analytical derivation of elasticity in breast phantoms for deformation tracking

    Patient-specific biomedical modeling of the breast is of interest for medical applications such as image registration, image-guided procedures, and alignment for biopsy or surgery. The computation of elastic properties is essential to simulate deformations in a realistic way. This study presents an innovative analytical method to compute the elastic modulus and evaluate the elasticity of a breast using magnetic resonance imaging (MRI) scans of breast phantoms. An analytical method for elasticity computation was developed and subsequently validated on a series of geometric shapes, and on four physical breast phantoms that are supported by a planar frame. This method can compute the elasticity of a shape directly from a set of MRI scans. For comparison, elasticity values were also computed numerically using two different simulation software packages. Application of the different methods on the geometric shapes shows that the analytically derived elongation differs from simulated elongation by less than 9% for cylindrical shapes, and up to 18% for other shapes that are also substantially vertically supported by a planar base. For the four physical breast phantoms, the analytically derived elasticity differs from numeric elasticity by 18% on average, which is in accordance with the difference in elongation estimation for the geometric shapes. The analytic method proved to be multiple orders of magnitude faster than the numerical methods. It can be concluded that the analytical elasticity computation method has good potential to supplement or replace numerical elasticity simulations in gravity-induced deformations, for shapes that are substantially supported by a planar base perpendicular to the gravitational field. The error is manageable, while the calculation procedure takes less than one second as opposed to multiple minutes with numerical methods. The results will be used in the MRI and Ultrasound Robotic Assisted Biopsy (MURAB) project
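To see why an analytical route can be so much faster than simulation, consider the simplest case mentioned above, a cylinder on a planar base. The sketch below uses the textbook one-dimensional result for a uniform column deforming under its own weight, delta = rho * g * L^2 / (2 * E), and inverts it for E. This is an assumed, idealised stand-in (linear elasticity, small strain, no lateral effects) and is not the formula derived in the paper. The function names are illustrative.

```python
G = 9.81  # gravitational acceleration in m/s^2


def self_weight_length_change(
    density: float, length: float, youngs_modulus: float, g: float = G
) -> float:
    """Length change (m) of a uniform column under its own weight.

    Stress at height z above the base is rho * g * (L - z); integrating the
    strain sigma / E over the length gives rho * g * L**2 / (2 * E).
    """
    return density * g * length**2 / (2.0 * youngs_modulus)


def youngs_modulus_from_length_change(
    density: float, length: float, delta: float, g: float = G
) -> float:
    """Invert the relation above: E = rho * g * L**2 / (2 * delta)."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return density * g * length**2 / (2.0 * delta)


if __name__ == "__main__":
    # Hypothetical soft phantom: 1000 kg/m^3, 0.1 m tall, E = 5 kPa.
    d = self_weight_length_change(1000.0, 0.1, 5000.0)
    print(d, youngs_modulus_from_length_change(1000.0, 0.1, d))
```

A closed-form inversion like this is a single arithmetic expression, which is consistent with the sub-second runtime reported in the abstract.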

    Multi-Objective and Multidisciplinary Design Optimisation of Unmanned Aerial Vehicle Systems using Hierarchical Asynchronous Parallel Multi-Objective Evolutionary Algorithms

    The overall objective of this research was to realise the practical application of Hierarchical Asynchronous Parallel Evolutionary Algorithms for Multi-objective and Multidisciplinary Design Optimisation (MDO) of UAV Systems using high fidelity analysis tools. The research looked at the assumed aerodynamics and structures of two production UAV wings and attempted to optimise these wings in isolation from the rest of the vehicle. The project was sponsored by the Asian Office of the Air Force Office of Scientific Research under contract number AOARD-044078. The two vehicle wings that were optimised were based upon assumptions made on the Northrop Grumman Global Hawk (GH), a High Altitude Long Endurance (HALE) vehicle, and the General Atomics Altair (Altair), a Medium Altitude Long Endurance (MALE) vehicle. The optimisations for both vehicles were performed at cruise altitude with MTOW minus 5% fuel and a 2.5g load case. The GH was assumed to use the NASA LRN 1015 aerofoil at the root, crank and tip locations with five spars and ten ribs. The Altair was assumed to use the NACA4415 aerofoil at all three locations with two internal spars and ten ribs. Both models used a parabolic variation of spar, rib and wing skin thickness as a function of span, and in the case of the wing skin thickness, also chord. The work was carried out by integrating the current University of Sydney designed Evolutionary Optimiser (HAPMOEA) with Computational Fluid Dynamics (CFD) and Finite Element Analysis (FEA) tools. The variable values computed by HAPMOEA were subjected to structural and aerodynamic analysis. The aerodynamic analysis computed the pressure loads using a Boeing-developed Morino class panel method code named PANAIR. These aerodynamic results were coupled to an FEA code, MSC.Nastran®, which computed the strain and displacement of the wings. The fitness of each wing was computed from the outputs of each program.
    In total, 48 design variables were defined to describe both the structural and aerodynamic properties of the wings subject to several constraints. These variables allowed for the alteration of the three aerofoil sections describing the root, crank and tip sections. They also described the internal structure of the wings, allowing for variable flexibility within the wing box structure. These design variables were manipulated by the optimiser such that two fitness functions were minimised. The fitness functions were the overall mass of the simulated wing box structure and the inverse of the lift-to-drag ratio. Furthermore, six penalty functions were added to further penalise genetically inferior wings and force the optimiser to not pass on their genetic material. The results indicate that, given the initial assumptions made on all the aerodynamic and structural properties of the HALE and MALE wings, a reduction in mass and drag is possible through the use of the HAPMOEA code. The code was terminated after 300 evaluations of each hierarchical level due to plateau effects. These evolutionary optimisation results could be further refined through a gradient-based optimiser if required. Even though a reduced number of evaluations was performed, weight and drag reductions of between 10 and 20 percent were easy to achieve and indicate that the wings of both vehicles can be optimised
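With two fitness functions to minimise (wing box mass and the inverse of the lift-to-drag ratio), candidate designs are usually compared by Pareto dominance rather than by a single score. The sketch below shows that comparison only. It is not the HAPMOEA code, and the `Design` type and all numeric values are hypothetical.

```python
from typing import NamedTuple


class Design(NamedTuple):
    name: str
    mass: float          # wing box mass in kg, to be minimised
    lift_to_drag: float  # L/D ratio, larger is better


def objectives(d: Design) -> tuple[float, float]:
    """Both objectives in minimisation form: (mass, 1 / (L/D))."""
    return (d.mass, 1.0 / d.lift_to_drag)


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    fa, fb = objectives(a), objectives(b)
    no_worse = all(x <= y for x, y in zip(fa, fb))
    strictly_better = any(x < y for x, y in zip(fa, fb))
    return no_worse and strictly_better


def pareto_front(designs: list[Design]) -> list[Design]:
    """Keep the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


if __name__ == "__main__":
    # Made-up candidates, for illustration only.
    candidates = [
        Design("A", 100.0, 20.0),
        Design("B", 120.0, 25.0),
        Design("C", 130.0, 18.0),
        Design("D", 110.0, 20.0),
    ]
    print([d.name for d in pareto_front(candidates)])
```

Here A and B survive because each is better than the other in one objective, while C and D are both dominated by A.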

    Accurate reconstruction of insertion-deletion histories by statistical phylogenetics

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes. Comment: 28 pages, 15 figures. arXiv admin note: text overlap with arXiv:1103.434
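The abstract refers to Felsenstein's pruning algorithm as the substitution-only starting point that the paper generalises. As a reminder of what that starting point computes, here is a minimal sketch for a single alignment column on a rooted binary tree. It covers substitutions only, not indels, and it assumes the Jukes-Cantor (JC69) model and a nested-tuple tree encoding, both chosen here for brevity rather than taken from the paper.

```python
import math

BASES = "ACGT"


def jc69(t: float) -> list[list[float]]:
    """JC69 transition matrix for a branch of length t (substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partials(node) -> list[float]:
    """Per-state likelihood of the observed leaves below node.

    A leaf is a single base such as "A"; an internal node is a tuple
    (left_child, left_branch_length, right_child, right_branch_length).
    """
    if isinstance(node, str):
        return [1.0 if b == node else 0.0 for b in BASES]
    left, t_left, right, t_right = node
    pl, pr = partials(left), partials(right)
    ml, mr = jc69(t_left), jc69(t_right)
    return [
        sum(ml[x][y] * pl[y] for y in range(4))
        * sum(mr[x][y] * pr[y] for y in range(4))
        for x in range(4)
    ]


def site_likelihood(tree) -> float:
    """Likelihood of one column, with the uniform JC69 root distribution."""
    return sum(0.25 * p for p in partials(tree))


if __name__ == "__main__":
    # Hypothetical three-taxon tree: ((A:0.1, A:0.1):0.05, G:0.2)
    tree = (("A", 0.1, "A", 0.1), 0.05, "G", 0.2)
    print(site_likelihood(tree))
```

Each internal node is visited once, so the cost per column is linear in the number of taxa, which is what makes the likelihood tractable before any indel machinery is added.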

    The Genome Sequence DataBase: towards an integrated functional genomics resource

    During 1998 the primary focus of the Genome Sequence DataBase (GSDB; http://www.ncgr.org/gsdb) located at the National Center for Genome Resources (NCGR) has been to improve data quality, improve data collections, and provide new methods and tools to access and analyze data. Data quality has been improved by extensive curation of certain data fields necessary for maintaining data collections and for using certain tools. Data quality has also been increased by improvements to the suite of programs that import data from the International Nucleotide Sequence Database Collaboration (IC). The Sequence Tag Alignment and Consensus Knowledgebase (STACK), a database of human expressed gene sequences developed by the South African National Bioinformatics Institute (SANBI), became available within the last year, allowing public access to this valuable resource of expressed sequences. Data access was improved by the addition of the Sequence Viewer, a platform-independent graphical viewer for GSDB sequence data. This tool has also been integrated with other searching and data retrieval tools. A BLAST homology search service was also made available, allowing researchers to search all of the data, including the unique data, that are available from GSDB. These improvements are designed to make GSDB more accessible to users, extend the rich searching capability already present in GSDB, and facilitate the transition to an integrated system containing many different types of biological data

    Nascent RNA sequencing reveals a dynamic global transcriptional response at genes and enhancers to the natural medicinal compound celastrol

    Most studies of responses to transcriptional stimuli measure changes in cellular mRNA concentrations. By sequencing nascent RNA instead, it is possible to detect changes in transcription in minutes rather than hours and thereby distinguish primary from secondary responses to regulatory signals. Here, we describe the use of PRO-seq to characterize the immediate transcriptional response in human cells to celastrol, a compound derived from traditional Chinese medicine that has potent anti-inflammatory, tumor-inhibitory, and obesity-controlling effects. Celastrol is known to elicit a cellular stress response resembling the response to heat shock, but the transcriptional basis of this response remains unclear. Our analysis of PRO-seq data for K562 cells reveals dramatic transcriptional effects soon after celastrol treatment at a broad collection of both coding and noncoding transcription units. This transcriptional response occurred in two major waves, one within 10 min, and a second 40-60 min after treatment. Transcriptional activity was generally repressed by celastrol, but one distinct group of genes, enriched for roles in the heat shock response, displayed strong activation. Using a regression approach, we identified key transcription factors that appear to drive these transcriptional responses, including members of the E2F and RFX families. We also found sequence-based evidence that particular transcription factors drive the activation of enhancers. We observed increased polymerase pausing at both genes and enhancers, suggesting that pause release may be widely inhibited during the celastrol response. Our study demonstrates that a careful analysis of PRO-seq time-course data can disentangle key aspects of a complex transcriptional response, and it provides new insights into the activity of a powerful pharmacological agent